# You can keep track of all the data analysis steps
2 + 2 + 3 # step 1
#> [1] 7
log(2 + 2 + 3) # step 2
#> [1] 1.94591
Introduction to R
Marine Ecosystem Dynamics
Plan for today’s lecture
- The R syntax
- The RStudio software
- Variables, functions and vectors
- Importing data using the readr package
Why use R?
Pro
- Free
- Open source
- Reproducible science
Cons
- Scary
- Syntax
# This can be scary
library(ggplot2) ; library(dplyr) ; set.seed(123)
tibble(Month = sample(month.abb, 100, replace = TRUE),
Genus = sample(c("Acartia", "Temora", "Centropages", "Pseudocalanus"), 100, replace = T),
Abundance = rnorm(100,12,7)) |>
group_by(Month, Genus) |>
summarise(Avg_abundance = mean(Abundance, na.rm = T)) |>
ggplot(aes(x = Genus, y = Avg_abundance)) +
geom_boxplot()
R is open and free
This means that people have worked on it and created tools and functions that everyone can use!
- R base functions (already implemented and loaded when starting a new session): e.g., plot(), +, -, sin()
- Additional functions (we need to load): e.g., ggplot(), select(), …
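As a small sketch of this distinction, the snippet below uses tools, a package that ships with R but is not attached at startup. It stands in here for any add-on package (an assumption made only so the example runs without installing anything), and the file name is hypothetical.

```r
# Base functions are available as soon as R starts
sin(0)
#> [1] 0

# A function from a package that is not attached can be reached with pkg::fun
tools::file_ext("Example_1.csv")   # hypothetical file name
#> [1] "csv"

# After library(), the function can be called by its bare name
library(tools)
file_ext("Example_1.csv")
#> [1] "csv"
```

The pkg::fun form makes it explicit which package a function comes from, at the cost of a little more typing.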
How to install and load packages
- A package needs to be installed only once
- To use the functions of a package, load it in each new session with library()
install.packages("PackageName")
library(PackageName)
R syntax
R as a calculator
- R can perform “basic” operations
2 + 2
#> [1] 4
3 * 4
#> [1] 12
(5 + 2) * (4 - 1)
#> [1] 21
- And more complex operations
sin(60)
#> [1] -0.3048106
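The negative result is not an error: sin() expects its argument in radians, not degrees. A minimal sketch of the conversion, where deg_to_rad is a hypothetical helper name introduced only for illustration:

```r
# sin() works in radians; degrees are converted by multiplying by pi / 180
deg_to_rad <- function(degrees) degrees * pi / 180   # hypothetical helper
sin(deg_to_rad(60))
#> [1] 0.8660254
```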
log(10)
#> [1] 2.302585
Variables
Variables in R can be of several types:
- Logical: TRUE or FALSE
- Numeric: 3.1 or 4
- Character: "Example"
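The type of a value can be checked with the base function class(). A short sketch using the kinds of values listed above:

```r
class(TRUE)
#> [1] "logical"
class(3.1)
#> [1] "numeric"
class(4)            # whole numbers are numeric too, unless written as 4L
#> [1] "numeric"
class("Example")
#> [1] "character"
```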
To assign a value to a variable, several options exist
- <- e.g. a <- 2
- -> e.g. 2 -> a
- assign() e.g. assign("a", 2)
- = e.g. a = 2
Assigning the same value to multiple variables
variable_4 <- variable_5 <- variable_6 <- "Value"
Functions
- All functions have the same structure, but the number of arguments may vary
function_name(argument1, ...)
- To find out which arguments a function needs, we can always consult the manual by typing ? before the function name
?plot()
If you want to go a step further
- You can define your own functions:
. . .
- And check that it gives the same result as base R:
my_addition(parameter_1 = 1, parameter_2 = 2) == 1 + 2
#> [1] TRUE
- Note that comparison operators are written as follows:
  - is equal: ==
  - is different: !=
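The definition of my_addition is not shown here. A minimal sketch that is consistent with the call above, assuming the function simply adds its two arguments (the body is an assumption, not the original code):

```r
# Assumed definition: returns the sum of its two arguments
my_addition <- function(parameter_1, parameter_2) {
  parameter_1 + parameter_2
}

my_addition(parameter_1 = 1, parameter_2 = 2)
#> [1] 3

# The "is different" operator returns TRUE when the two sides differ
my_addition(1, 2) != 4
#> [1] TRUE
```

Arguments can be passed by name, as in the first call, or by position, as in the second.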
Vectors
- Vectors can be created using different functions
. . .
- R works on whole vectors, so a calculation is applied to every element at once
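A short sketch of some common base R ways to build vectors, and of element-wise calculation. The variable names and values are illustrative only:

```r
# Three common ways to create a vector
abundance <- c(12, 7, 3)             # combine individual values
steps     <- seq(from = 1, to = 3)   # a sequence: 1 2 3
ones      <- rep(1, times = 3)       # repeat a value: 1 1 1

# Arithmetic is applied element by element
abundance * 2
#> [1] 24 14  6
abundance + steps
#> [1] 13  9  6

# Many functions summarise a whole vector in one call
mean(abundance)
#> [1] 7.333333
```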
Importing data
- An efficient and convenient way to import data is to use the readr package
- The main functions have the form read_*, where * can be:
  - csv: comma-separated values
  - tsv: tab-separated values
  - csv2: semicolon-separated values with , as the decimal mark
  - delim: delimited files
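To make the difference between these variants concrete, here is a minimal sketch that reads the same small table in three ways. It assumes readr 2.0 or later, where literal text wrapped in I() is read as data rather than as a file path; the inline data are made up for illustration.

```r
library(readr)

# Hypothetical inline data: I() marks the string as literal data, not a path
csv_text  <- I("Month,Genus,Abundance\nJan,Acartia,12.5\nFeb,Temora,8.9")
csv2_text <- I("Month;Genus;Abundance\nJan;Acartia;12,5\nFeb;Temora;8,9")

# Comma-separated values, "." as the decimal mark
from_csv <- read_csv(csv_text, show_col_types = FALSE)

# Semicolon-separated values, "," as the decimal mark
from_csv2 <- read_csv2(csv2_text, show_col_types = FALSE)

# read_delim() is the general form: the delimiter and locale are given explicitly
from_delim <- read_delim(csv2_text, delim = ";",
                         locale = locale(decimal_mark = ","),
                         show_col_types = FALSE)

# The three calls should yield the same Abundance values
all.equal(from_csv$Abundance, from_csv2$Abundance)
all.equal(from_csv$Abundance, from_delim$Abundance)
```

Choosing the variant that matches the file's delimiter and decimal mark avoids numbers being read as text.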
Example
library(readr)
#> Warning: package 'readr' was built under R version 4.1.2
Example_1 <- readr::read_csv("./../../assets/data/Example_1.csv")
#> Rows: 100 Columns: 3
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (2): Month, Genus
#> dbl (1): Abundance
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(Example_1)
#> # A tibble: 6 × 3
#>   Month Genus         Abundance
#>   <chr> <chr>             <dbl>
#> 1 Dec   Centropages      -0.552
#> 2 Apr   Centropages      12.5
#> 3 Feb   Centropages      18.4
#> 4 Sep   Acartia          25.6
#> 5 Mar   Pseudocalanus     9.70
#> 6 Jul   Temora            8.90
tail(Example_1)
#> # A tibble: 6 × 3
#>   Month Genus         Abundance
#>   <chr> <chr>             <dbl>
#> 1 Jan   Pseudocalanus     22.7
#> 2 Feb   Acartia           27.6
#> 3 Aug   Acartia            7.75
#> 4 Jan   Centropages       17.0
#> 5 Feb   Centropages        5.95
#> 6 Aug   Temora            17.2
Plan for tomorrow
- Introduction to tidyverse
- Pipe the data using magrittr
- Clean the data using tidyr
- Arrange the data using dplyr
- Plot using ggplot2
Do not hesitate to use Google to get help!
If you run into a problem, you are probably not the first: someone has most likely already asked for a solution on a forum!